In this paper, we present an OpenCL-based heterogeneous implementation of a computer vision algorithm (image inpainting-based object removal) on mobile devices. To take advantage of the computational power of the mobile processor, the algorithm workflow is partitioned between the CPU and the GPU based on profiling results on mobile devices, so that the computationally intensive kernels are accelerated by the mobile GPGPU (general-purpose computing using graphics processing units). By exploring the implementation trade-offs and applying the proposed optimization strategies at different levels, including algorithm optimization, parallelism optimization, and memory access optimization, we significantly speed up the algorithm with the CPU-GPU heterogeneous implementation while preserving the quality of the output images. Experimental results show that heterogeneous computing based on GPGPU co-processing can significantly speed up computer vision algorithms and make them practical on real-world mobile devices.